The goals / steps of this project are the following:
The code for this step is contained in the first 3 code cells of the IPython notebook called project5.
I read in all the vehicle and non-vehicle images, randomly selected one from each category, and displayed them below.
I created the features with the following methods:

1. Spatial binning of color - the training images are resized to 16x16 and flattened into a vector.
2. Histograms of color - the color distribution of an image can help us distinguish between a car and a non-car.
3. Histogram of Oriented Gradients (HOG) - I used the scikit-image hog() method to detect a car by looking at its edges. HOG computes the gradients over blocks of cells and then creates a histogram from these gradient values.
I extracted and combined the feature vectors into one for each image and then normalized the features. Here’s an example of a car image and its features.
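The combine-and-normalize step amounts to stacking the per-image vectors into a matrix and fitting a scaler on it; a minimal sketch with placeholder data (the random vectors stand in for the real concatenated spatial + histogram + HOG features):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Placeholder: one combined feature vector per image
feature_list = [np.random.rand(1000) for _ in range(10)]

X = np.vstack(feature_list).astype(np.float64)
scaler = StandardScaler().fit(X)   # per-feature zero mean, unit variance
scaled_X = scaler.transform(X)
```

Normalizing matters here because the spatial, histogram, and HOG components have very different value ranges, and an SVM would otherwise be dominated by the largest-magnitude features.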
I tried different combinations of parameters, trained a model with each to compare accuracies, and finally settled on the values below because they gave me over 98% accuracy.
For HOG:
For spatial binning and histograms of color:
I trained a linear SVM using HOG and color features combined. First I normalized the training data, then randomly shuffled it and split it into 80% training and 20% test sets. This code can be found in code cell #4 of the notebook.
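The shuffle/split/train step can be sketched as follows; the random features and labels below are placeholders for the real car (1) and non-car (0) feature vectors:

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the real feature matrix/labels
rng = np.random.RandomState(0)
X = rng.rand(200, 50)
y = np.hstack([np.ones(100), np.zeros(100)])

# Randomly shuffle and split into 80% training / 20% test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

svc = LinearSVC()
svc.fit(X_train, y_train)
test_accuracy = svc.score(X_test, y_test)
```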
The code for this is in cell #5 of the notebook. I restricted the search space to the lower half of the image to capture the road instead of the sky, and I used two different scales. Each one has different x_start_stop and y_start_stop values, and the overlap is set to 0.75. For each sliding window:

- extract features for that window
- scale the extracted features
- feed the scaled features to the classifier
- predict whether the window contains a car; if it does and svc.decision_function(X) > threshold, add the window to the window list

The list of windows is returned at the end.
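The per-window loop can be sketched like this. The function and argument names are illustrative, not the notebook's, and the decision threshold value is a stand-in; `extract_features` is any function mapping a patch to a 1-D feature vector:

```python
import numpy as np

def search_windows(img, windows, clf, scaler, extract_features,
                   decision_threshold=0.5):
    """Return the windows the classifier confidently labels as 'car'."""
    hot_windows = []
    for ((x1, y1), (x2, y2)) in windows:
        patch = img[y1:y2, x1:x2]
        # Extract and scale features for this window
        features = extract_features(patch).reshape(1, -1)
        scaled = scaler.transform(features)
        # Keep only predictions beyond the decision-function margin
        if (clf.predict(scaled)[0] == 1
                and clf.decision_function(scaled)[0] > decision_threshold):
            hot_windows.append(((x1, y1), (x2, y2)))
    return hot_windows
```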
Below is an image with the windows drawn.
I played around with different parameter settings to extract the features. I used YCrCb 3-channel HOG features plus spatially binned color and histograms of color in the feature vector. When doing the sliding window search, I restricted it to the lower half of the picture so it only focused on the road instead of the sky.
Here are some example images:
Here’s a link to my video result
To reduce false positives and negatives, I combined the heat maps across the last 5 frames and then thresholded the combined heat map with threshold=4. After that I used scipy.ndimage.measurements.label on the combined heat map to identify the vehicle positions, turning many overlapping windows into one bounding box per vehicle. I also used svc.decision_function(X) > threshold so that only high-confidence predictions are counted as vehicle detections. The code for this is located in cell #7 of the notebook.
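The frame-integration step can be sketched as below: add heat for each detected window, keep a rolling window of the last 5 heat maps, threshold the sum, and label the surviving blobs (newer SciPy exposes `scipy.ndimage.label` directly; older code imports it from `scipy.ndimage.measurements`). Function names are illustrative:

```python
import numpy as np
from collections import deque
from scipy.ndimage import label  # scipy.ndimage.measurements.label in older SciPy

def add_heat(heatmap, windows):
    # +1 for every pixel inside each detected window
    for ((x1, y1), (x2, y2)) in windows:
        heatmap[y1:y2, x1:x2] += 1
    return heatmap

history = deque(maxlen=5)  # heat maps from the last 5 frames

def labeled_boxes(frame_shape, windows, threshold=4):
    heat = add_heat(np.zeros(frame_shape[:2], dtype=np.float32), windows)
    history.append(heat)
    combined = np.sum(list(history), axis=0)
    combined[combined <= threshold] = 0     # reject weak, transient detections
    labels, n_cars = label(combined)
    boxes = []
    for car in range(1, n_cars + 1):
        ys, xs = np.nonzero(labels == car)
        # One tight bounding box per labeled blob
        boxes.append(((xs.min(), ys.min()), (xs.max(), ys.max())))
    return boxes
```

Because a real car is detected at roughly the same place for several consecutive frames while spurious detections are not, summing over 5 frames before thresholding suppresses the one-off false positives.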
I spent a lot of time tuning the window settings together with the heat-map threshold in order to eliminate most of the false positives.
Currently the pipeline works relatively well. However, it could be improved by making the window search and feature extraction run more efficiently. I also plan to further reduce false positives and negatives by tracking where each car was in previous frames and using that to predict where it should be in the current frame. There are some frames in which the current pipeline is not able to detect the car; I'll try to improve on that as well.
There are many potential points of failure in the current pipeline if the video contains the following: - Non-car objects (e.g., pedestrians) - A two-way street with cars traveling in the opposite direction - Pixelated frames